# MACS: Multi-Agent Reinforcement Learning for Optimization of Crystal Structures

This repository is the official implementation of the "MACS: Multi-Agent Reinforcement Learning for Optimization of Crystal Structures" manuscript.

## Table of Contents

- [Requirements](#requirements)
- [Training](#training)
- [Optimization](#optimization)
- [Optimization with the baselines](#optimization-with-the-baselines)
- [Optimization with the CG baseline](#optimization-with-the-cg-baseline)
- [Test set construction](#test-set-construction)
- [Results](#results)
- [RL Environment configuration](#rl-environment-configuration)
- [License](#license)

## Requirements

Python 3.9 or Python 3.10 is required.
`git` must also be installed; see https://github.com/git-guides/install-git for instructions.

Then install the requirements:

```setup
pip install -r requirements.txt
```

You also need to install the AIRSS package by following the instructions here: https://airss-docs.github.io/getting-started/installation/

## Training

To train the MACS policy presented in the paper:

```train
python train.py
```

Optional parameters:

- `comp_config`: the path to the .json file containing the composition/structure size information. The "comp_config.json" is used by default.
- `algo_config`: the path to the .json file containing the RLlib configuration for SAC. The "algo_config.json" is used by default.
- `env_config`: the path to the .json file containing the environment parameters, including features for observations, the reward design choice, and the action design choice. "env_config.json" is used by default.
- `checkpoint`: the path to the policy from which to continue training.
- `num_workers`: the number of additional parallel environments to collect experience. By default, this value is 0, so only one environment is sampled. The paper uses 39, so 40 parallel environments are used simultaneously.
- `iterations`: the number of training iterations. The default value is 25000.

Regular checkpoints are saved in the "ray_results" directory.
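
As an illustration of how the optional parameters combine, the sketch below assembles a training command with the worker count reported for the paper. It assumes that the scripts accept options in the `--name=value` form shown for `create_starting_structures.py` in this README; `build_command` is a hypothetical helper and not part of the repository.

```python
import sys


def build_command(script, **options):
    """Return an argv list for `script`, using the assumed --name=value syntax."""
    argv = [sys.executable, script]
    for name, value in options.items():
        if value is not None:  # options set to None are left at their defaults
            argv.append(f"--{name}={value}")
    return argv


# 39 additional workers give 40 parallel environments, as described above.
train_cmd = build_command("train.py", num_workers=39, iterations=25000)
print(" ".join(train_cmd[1:]))
```

The resulting list can be passed to `subprocess.run(train_cmd, check=True)` from the repository root to start training.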

## Optimization

The MACS policy can be used to optimize a set of structures. All structures in the set should belong to the same composition and have the same number of atoms. To optimize a test set with MACS:

```eval
python optimize.py
```
Optional parameters:
- `policy`: the policy directory. By default, this is "policy", the directory containing the MACS policy that is used in the paper. If you train your own policy, set this parameter to the checkpoint you want to use.
- `comp`: the name of the composition/structure size that will be optimized. "SrTiO3x8" is used by default to optimize the structures of SrTiO3 with 40 atoms.
- `comp_config`: the path to the .json file containing the composition/structure size information. This file must contain information about `comp`. The "comp_config.json" is used by default and contains the information about all compositions and structure sizes used in the paper.
- `algo_config`: the path to the .json file containing the RLlib configuration for SAC. The "algo_config.json" is used by default.
- `env_config`: the path to the .json file containing the environment parameters, including features for observations, the reward design choice, and the action design choice. "env_config.json" is used by default.
- `input_dir`: the directory containing starting structures. "starting_structures/SrTiO3x8" is the default value.
- `output_dir`: the directory where the optimization results will be stored. The `output_dir` parameter in the `env_config` file is used as the default value.

The optimized structures will be saved to the `<output_dir>/final_structures` directory. The "opt_traj.csv" file will contain the evolution of step/energy/time/max forces during optimization for each optimized structure. The statistics will be saved to the "stat.csv" file and contain the mean number of steps in successful optimizations, the mean number of energy calculations in successful optimizations, the mean optimization time, the failure rate, and the mean final energy.

## Optimization with the baselines

To optimize a test set with one of the baselines, you need to run:

```eval
python baseline_optimize.py
```
Optional parameters:
- `method`: the name of the method used for the optimization. The possible names are "BFGS", "BFGSLineSearch" (BFGSLS in the paper), "FIRE", "FIRE+BFGSLineSearch" (FIRE+BFGSLS in the paper), "MDMin", and "SciPyFminCG" (CG in the paper). "BFGSLineSearch" is the default value.
- `input_dir`: the directory containing starting structures. "starting_structures/SrTiO3x8" is the default value.
- `output_dir`: the directory where the optimization results will be stored. "baseline_output" is the default value.
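
To run every baseline on the same test set, the method names listed above can be looped over. The sketch below only builds the command lines. It assumes the `--name=value` option syntax shown for `create_starting_structures.py`, assumes that "BFGS", "FIRE", and "MDMin" carry the same labels in the paper, and uses a hypothetical layout with one output subdirectory per method.

```python
import sys

# Paper label -> value of the `method` parameter, following the list above.
BASELINES = {
    "BFGS": "BFGS",
    "BFGSLS": "BFGSLineSearch",
    "FIRE": "FIRE",
    "FIRE+BFGSLS": "FIRE+BFGSLineSearch",
    "MDMin": "MDMin",
    "CG": "SciPyFminCG",
}


def baseline_commands(input_dir, output_root):
    """Build one baseline_optimize.py argv list per baseline method."""
    commands = {}
    for label, method in BASELINES.items():
        commands[label] = [
            sys.executable,
            "baseline_optimize.py",
            f"--method={method}",
            f"--input_dir={input_dir}",
            # Hypothetical layout: a separate output directory for each method.
            f"--output_dir={output_root}/{method}",
        ]
    return commands


commands = baseline_commands("starting_structures/SrTiO3x8", "baseline_output")
for label, argv in commands.items():
    print(label, "->", " ".join(argv[1:]))
```

Each list can be executed with `subprocess.run(argv, check=True)` from the repository root.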

## Optimization with the CG baseline

ASE and SciPy packages do not provide the number of energy calculations for the conjugate gradient method. If you wish to use this method for optimization and log the number of energy calculations, you should replace some files in the code of these libraries with the files provided in the "CG" directory. These files implement logging of the number of energy calculations the same way as native ASE methods do. Use the files in the "CG" directory to replace the following files in the ASE and SciPy packages:

- sciopt.py --> ase/optimize/sciopt.py
- _dcsrch.py --> scipy/optimize/_dcsrch.py
- _linesearch.py --> scipy/optimize/_linesearch.py
- _optimize.py --> scipy/optimize/_optimize.py
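
The replacement can also be scripted. The following is a minimal sketch, not part of the repository: it copies the four files listed above into the installed packages and keeps a backup of each original. The helper names and the ".orig" backup suffix are hypothetical, and the sketch assumes that the files in "CG" match the installed ASE and SciPy versions.

```python
import importlib.util
import shutil
from pathlib import Path

# File in the "CG" directory -> path relative to the directory holding the packages.
CG_FILES = {
    "sciopt.py": "ase/optimize/sciopt.py",
    "_dcsrch.py": "scipy/optimize/_dcsrch.py",
    "_linesearch.py": "scipy/optimize/_linesearch.py",
    "_optimize.py": "scipy/optimize/_optimize.py",
}


def package_parent(name):
    """Return the directory that contains the installed top-level package `name`."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(f"{name} is not installed")
    # spec.origin points at <parent>/<name>/__init__.py
    return Path(spec.origin).parent.parent


def install_cg_files(cg_dir, roots, backup_suffix=".orig"):
    """Copy the patched files over the installed ones, backing up each original once.

    `roots` maps a package name ("ase", "scipy") to the directory containing it.
    """
    replaced = []
    for filename, relative in CG_FILES.items():
        package = relative.split("/")[0]
        target = Path(roots[package]) / relative
        backup = target.with_name(target.name + backup_suffix)
        if target.exists() and not backup.exists():
            shutil.copy2(target, backup)  # keep the original for later restoration
        shutil.copy2(Path(cg_dir) / filename, target)
        replaced.append(target)
    return replaced
```

A typical call would be `install_cg_files("CG", {"ase": package_parent("ase"), "scipy": package_parent("scipy")})`, run inside the environment used for the baselines. Copying the ".orig" files back restores the unmodified libraries.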

## Test set construction

The test set for the SrTiO3 structures with 40 atoms is placed in the "starting_structures/SrTiO3x8" folder. To create your own test set, you need to run:

```eval
python create_starting_structures.py --output_dir=<directory>
```
where `<directory>` is the directory where the structures will be created.

Optional parameters:
- `comp`: the name of the composition/structure size that you want to create. "SrTiO3x8" is used by default to create structures of SrTiO3 with 40 atoms.
- `comp_config`: the path to the .json file containing the composition/structure size information. This file must contain information about `comp`. The "comp_config.json" is used by default and contains names and information about all composition/structure sizes used in this study.
- `n`: the number of structures to create, 10 by default.

## Results

The "MACS_output" directory contains the opt_traj files for all test sets and the final structures for the test set of SrTiO3 structures with 40 atoms.
To reproduce the results from the paper for a test set, follow these steps:

- Construct a test set with 300 structures for a given composition and structure size. The test set of SrTiO3 structures with 40 atoms is stored in the "starting_structures/SrTiO3x8" directory.
- Optimize the test set with MACS and the baselines.
- Compare the metrics in the corresponding statistics files.
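
For the comparison step, the statistics files of several runs can be collected side by side. The sketch below assumes that each run wrote a "stat.csv" file with a header row directly into its output directory; the directory names in the example are placeholders, and no column names are assumed.

```python
import csv
from pathlib import Path


def load_stats(output_dir, filename="stat.csv"):
    """Read the statistics file of one run into a list of row dictionaries."""
    with open(Path(output_dir) / filename, newline="") as handle:
        return list(csv.DictReader(handle))


def compare_runs(output_dirs):
    """Map each run directory to its statistics rows, skipping runs without a file."""
    table = {}
    for directory in output_dirs:
        if (Path(directory) / "stat.csv").is_file():
            table[str(directory)] = load_stats(directory)
    return table


if __name__ == "__main__":
    # Placeholder directories: replace them with the output_dir of each run.
    for run, rows in compare_runs(["MACS_output", "baseline_output"]).items():
        for row in rows:
            print(run, row)
```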

## RL Environment configuration

The "env_config.json" file contains the hyperparameters used for the RL environment and defines the design of the observations, rewards, and actions. These hyperparameters include:
- `comp` - composition/structure size in format "SrTiO3x2" or "SrTiO3" or a list of compositions in format "SrTiO3x2,Cu28S16". These names are connected to the entries in the "comp_config.json" file.
- `env_name` - the name of the environment. It should be the same for training and optimizing. 
- `max_cycles` - the maximum number of steps in each episode
- `neighbors_limit` - the number of nearest neighbors
- `max_episodes` - the maximum number of sequential episodes optimizing the same structure
- `reward_type` - the reward formula:
  - "log-grad-drop": Eq. 5 in the paper;
  - "log-grad-drop-plus-const": Eq. 10 in the paper;
  - "log-grad-drop-plus-ave": Eq. 11 in the paper.
- `step_cost` - penalty which is used in the "log-grad-drop-plus-const" reward
- `use_log_gnorm_feature` - whether to use log of gnorm feature
- `use_grad_feature` - whether to use the gradient vector as a feature
- `use_d_grad_feature` - whether to use the difference between the previous and current gradients as a feature
- `use_last_step_feature` - whether to use the atom's last displacement as a feature
- `cif` - cif file for the starting structure if we do not want to optimize random structures
- `max_grad_value` - g_max
- `stepsize_min` - the lower bound on the components of the action vector
- `stepsize_max` - the upper bound on the components of the action vector
- `variable_step_size` - defines the action design. null: Eq. 12 in the paper; "gnorm": Eq. 4 in the paper
- `varss_c4` - c_max
- `fmax` - the threshold for the maximum forces below which optimization stops.
- `store_trajectory` - whether to store the trajectories of optimization or not
- `output_dir` - the path to the directory to save the trajectories, the energy history, and the final structures
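
To try a different reward or action design without editing the default file, a modified copy of the configuration can be written and passed through the `env_config` parameter. The sketch below assumes that "env_config.json" is a flat JSON object whose keys are the hyperparameter names listed above; `derive_env_config` is a hypothetical helper, and the checks only cover the values documented in this section.

```python
import json
from pathlib import Path

# Reward names documented above.
REWARD_TYPES = {"log-grad-drop", "log-grad-drop-plus-const", "log-grad-drop-plus-ave"}
# Documented values of `variable_step_size` (None corresponds to JSON null).
STEP_SIZE_MODES = {None, "gnorm"}


def derive_env_config(source, target, **overrides):
    """Load an environment config, override existing keys, and save a copy."""
    config = json.loads(Path(source).read_text())
    for key, value in overrides.items():
        if key not in config:  # guard against misspelled parameter names
            raise KeyError(f"unknown environment parameter: {key}")
        config[key] = value
    reward = config.get("reward_type")
    if "reward_type" in config and reward not in REWARD_TYPES:
        raise ValueError(f"unsupported reward_type: {reward!r}")
    mode = config.get("variable_step_size")
    if mode not in STEP_SIZE_MODES:
        raise ValueError(f"unsupported variable_step_size: {mode!r}")
    Path(target).write_text(json.dumps(config, indent=2))
    return config
```

For example, `derive_env_config("env_config.json", "my_env_config.json", reward_type="log-grad-drop-plus-const", step_cost=0.1)` writes a copy that selects the constant-penalty reward; the value 0.1 is a placeholder, not a recommended setting.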

## License

The code is distributed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License (CC BY-NC-SA 4.0).
